Dynamic system and method for content and topic based synchronization during presentations

ABSTRACT

Disclosed embodiments provide techniques for automatically synchronizing a visual presentation with a live presenter. Visual presentation slides are preprocessed to determine one or more topics for each slide. A topic index contains one or more topics corresponding to slides of the presentation. As a presenter provides a verbal presentation for corresponding slides, natural language processing analyzes the verbal presentation and creates one or more temporal verbal topic categories. The temporal verbal topic categories are used to search the topic index to find one or more slides that best match the current temporal verbal topic categories. In this way, the slides can automatically follow the discussion of the presenter, enabling improved presentations that can enhance the user experience, increase audience engagement, and improve the dissemination of information.

FIELD

Embodiments relate to dynamic systems and methods for content and topic-based synchronization during presentations.

BACKGROUND

A common way for knowledge to be transferred in a school class, seminar, web conference, etc., is for a teacher or presenter to speak about a topic and use slides having visual information as a supplement. Conventionally, the slides are transitioned from one to another by the presenter, or someone managing in the background, by pressing a button on a mouse, screen, or other controller. This can be inefficient, for example, as sometimes the user may forget to transition the slides at the appropriate time in the speech. Sometimes slide presentations can be very long, such as 50 slides or more. This can make it difficult for a user to switch between the slides when, for example, an audience member asks a question, the answer of which relates to a slide far in the queue from the currently-displayed slide. Accordingly, there exists a need for improvement in technology relating to electronic slide presentations.

SUMMARY

In one aspect, there is provided a computer-implemented method for automatic synchronization of a visual presentation comprising a plurality of slides with a verbal presentation, comprising: performing a computer-generated topic analysis for each slide of the plurality of slides; creating a topic index for the visual presentation, wherein the topic index comprises an entry for each slide, wherein each entry includes one or more topic keywords associated therewith; performing a real-time computerized natural language analysis of the verbal presentation; deriving one or more temporal verbal topic categories from the real-time computerized natural language analysis; searching the topic index for a best matching entry, based on the one or more temporal verbal topic categories; and rendering a slide from the plurality of slides that corresponds to the best matching entry.

In another aspect, there is provided an electronic communication device comprising: a processor; a memory coupled to the processor, the memory containing instructions, that when executed by the processor, perform the steps of: performing a computer-generated topic analysis for each slide from a visual presentation comprising a plurality of slides; creating a topic index for the visual presentation, wherein the topic index comprises an entry for each slide, wherein each entry includes one or more topic keywords associated therewith; performing a real-time computerized natural language analysis of a verbal presentation; deriving one or more temporal verbal topic categories from the real-time computerized natural language analysis; searching the topic index for a best matching entry, based on the one or more temporal verbal topic categories; and rendering a slide from the plurality of slides that corresponds to the best matching entry.

In yet another aspect, there is provided a computer program product for automatic synchronization of a visual presentation, for an electronic computing device comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the electronic computing device to: perform a computer-generated topic analysis for each slide from a plurality of slides; create a topic index for the visual presentation, wherein the topic index comprises an entry for each slide, wherein each entry includes one or more topic keywords associated therewith; perform a real-time computerized natural language analysis of a verbal presentation; derive one or more temporal verbal topic categories from the real-time computerized natural language analysis; search the topic index for a best matching entry, based on the one or more temporal verbal topic categories; and render a slide from the plurality of slides that corresponds to the best matching entry.

BRIEF DESCRIPTION OF THE DRAWINGS

Features of the disclosed embodiments will be more readily understood from the following detailed description of the various aspects of the invention taken in conjunction with the accompanying drawings.

FIG. 1 is a diagram for an environment of embodiments of the present invention.

FIG. 2 is a device in accordance with embodiments of the present invention.

FIG. 3 is a flowchart indicating process steps for visual presentation processing in accordance with embodiments of the present invention.

FIG. 4 is a flowchart indicating process steps for verbal presentation processing in accordance with embodiments of the present invention.

FIG. 5 is a flowchart indicating process steps for presentation synchronization in accordance with embodiments of the present invention.

FIG. 6 is an exemplary slide presentation used to describe operation of embodiments of the present invention.

FIG. 7 shows an example of a forward slide transition based on a verbal presentation in accordance with embodiments of the present invention.

FIG. 8 shows an example of a backward slide transition based on a verbal presentation in accordance with embodiments of the present invention.

FIG. 9 shows an example of rendering multiple slides simultaneously based on a verbal presentation in accordance with embodiments of the present invention.

FIG. 10 shows an example of disambiguation in accordance with embodiments of the present invention.

FIG. 11 shows an example of a dispersion analysis in accordance with embodiments of the present invention.

FIG. 12 shows an example of a bigram analysis in accordance with embodiments of the present invention.

FIG. 13 shows examples of entries in a topic index for a visual presentation.

FIG. 14 shows an exemplary user interface for a web-conferencing embodiment of the present invention.

The drawings are not necessarily to scale. The drawings are merely representations, not necessarily intended to portray specific parameters of the invention. The drawings are intended to depict only example embodiments of the invention, and therefore should not be considered as limiting in scope. In the drawings, like numbering may represent like elements. Furthermore, certain elements in some of the figures may be omitted, or illustrated not-to-scale, for illustrative clarity.

DETAILED DESCRIPTION

Disclosed embodiments provide techniques for automatically synchronizing a visual presentation with a live presenter. Visual presentation slides are preprocessed to determine one or more topics for each slide. A topic index contains one or more topics corresponding to slides of the presentation. As a presenter provides a verbal presentation for corresponding slides, natural language processing analyzes the verbal presentation and creates one or more temporal verbal topic categories. The temporal verbal topic categories are used to search the topic index to find one or more slides that best match the current temporal verbal topic categories. In this way, the slides can automatically follow the discussion of the presenter, enabling improved presentations that can enhance the user experience, increase audience engagement, and improve the dissemination of information.

Reference throughout this specification to “one embodiment,” “an embodiment,” “some embodiments”, or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” “in some embodiments”, and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.

Moreover, the described features, structures, or characteristics of the invention may be combined in any suitable manner in one or more embodiments. It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit and scope and purpose of the invention. Thus, it is intended that the present invention cover the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents. Reference will now be made in detail to the preferred embodiments of the invention.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of this disclosure. As used herein, the singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, the use of the terms “a”, “an”, etc., do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced items. The term “set” is intended to mean a quantity of at least one. It will be further understood that the terms “comprises” and/or “comprising”, or “includes” and/or “including”, or “has” and/or “having”, when used in this specification, specify the presence of stated features, regions, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, regions, and/or elements.

FIG. 1 is a diagram 100 for an environment of embodiments of the present invention. Presentation synchronization system 102 comprises processor 140, memory 142, and storage 144. System 102 is an electronic communication device. Instructions 147 for executing embodiments of the present invention are shown stored in memory 142. Presentation synchronization system 102 is in communication with network 124. In embodiments, network 124 may be the Internet, a wide area network (WAN), a local area network (LAN), a cloud network, a combination thereof, or any other suitable network. Client devices 104, 106, and 108 are also in communication with network 124. Client device 108 is in communication with projector 130. In embodiments, a visual presentation may be rendered by projecting, using projector 130, onto a large screen or wall, and/or displaying on a computer screen of one or more of the client devices. Any suitable method of rendering is included within the scope of the present invention. Client devices may be smartphones, tablet computers, laptop computers, desktop computers, a combination thereof, or other suitable devices.

FIG. 2 is a device 200 in accordance with embodiments of the present invention. Device 200 is shown as a simplified diagram of modules. Device 200 is an electronic computing device. Device 200 includes a processor 202, which is coupled to a memory 204. Memory 204 may include dynamic random access memory (DRAM), static random access memory (SRAM), magnetic storage, and/or a read only memory such as flash, EEPROM, optical storage, or other suitable memory. In some embodiments, the memory 204 may not be a transitory signal per se. Memory 204 includes instructions, which when executed by the processor, implement steps of the present invention. In embodiments, device 200 may have multiple processors 202, and/or multiple cores per processor.

Device 200 may further include storage 206. In embodiments, storage 206 may include one or more magnetic storage devices such as hard disk drives (HDDs). Storage 206 may include one or more solid state drives (SSDs). Any other storage device may be included instead of, or in addition to, those disclosed herein.

Device 200 further includes a user interface 208. In some embodiments, the user interface may include a display system, which may include one or more displays, examples of which include a liquid crystal display (LCD), a plasma display, a cathode ray tube (CRT) display, a light emitting diode (LED) display, an organic LED (OLED) display, or other suitable display technology. The user interface 208 may include a keyboard, mouse, and/or a touch screen, incorporating a capacitive or resistive touch screen in some embodiments. In embodiments, the device 200 includes a microphone 212 which may be used to receive speech from a user.

The device 200 further includes a communication interface 210. In some embodiments, the communication interface 210 may include a wireless communication interface that includes modulators, demodulators, and antennas for a variety of wireless protocols including, but not limited to, Bluetooth™, Wi-Fi, and/or cellular communication protocols for communication over a computer network. Any communication interface, now known or hereafter developed, may be substituted.

In some embodiments, the elements of the invention are executed solely on a client device. In other embodiments, the elements of the invention are executed on a server remotely from a client device. In some embodiments, some elements are executed on the client device, and others are executed on the server.

FIG. 3 is a flowchart 300 indicating process steps for visual presentation processing in accordance with embodiments of the present invention. Embodiments of the invention provide a computer-implemented method for automatic synchronization of a visual presentation, comprising a plurality of slides, with a verbal presentation.

At 302, the visual presentation is ingested into system 102 (FIG. 1). The visual presentation ingest may include importing of a corpus or raw text, scraping of web pages linked to in the presentation, and/or any other suitable process.

At 304, the ingested visual presentation is then preprocessed for indexing. This includes performing a computer-generated topic analysis for each slide of the plurality of slides. A topic index, comprising at least one entry for each slide, is created for the visual presentation. Each entry includes one or more topic keywords associated therewith.

At 306, the topic analysis for each slide comprises performing a computerized natural language processing analysis of each slide of the plurality of slides. In embodiments, a real-time computerized natural language analysis of the presentation is performed. The natural language analysis may include, but is not limited to, indexing, concordance, stop word processing, bigram processing, dispersion analysis, lexical richness analysis (ratio of distinct words to total words), disambiguation, part-of-speech analysis, anaphora resolution (the process of identifying what a pronoun or noun phrase refers to), or any other suitable process.
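By way of non-limiting illustration, the lexical richness analysis mentioned above (ratio of distinct words to total words) may be sketched as follows. The function name and the simple regular-expression tokenizer are assumptions made for this example only and are not taken from the disclosure.

```python
import re


def lexical_richness(text):
    """Return the ratio of distinct words to total words (0.0 for empty text)."""
    # Hypothetical tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return len(set(words)) / len(words)


# Five words, four of them distinct ("the" repeats).
ratio = lexical_richness("the cat saw the dog")
```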

In some embodiments, a real-time computerized natural language analysis 306 comprises performing an entity detection process 308 on text data from one or more slides from the plurality of slides. A topic is then derived based on each entity detected from the entity detection process. The derived topic and corresponding slide are recorded in the topic index. The entity detection may include extraction, which is the detection and preparation of named entity occurrences. The extraction phase includes POS (part of speech) tagging, tokenization, sentence boundary detection, capitalization rules and in-document statistics. The entity detection may further include noun identification, followed by identifying a subset of nouns including proper nouns, and nouns deemed to be topically pertinent. The extracted entities may be used as keywords to populate a topic index.
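As a minimal sketch of populating a topic index from slide text, the following example substitutes simple stop-word filtering for the entity detection process described above; it performs no part-of-speech tagging or named entity extraction. The stop-word list, the function names, and the slide text are hypothetical and serve only to illustrate the shape of the index.

```python
import re

# Hypothetical, deliberately small stop-word list for illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to",
              "is", "are", "we", "with", "such", "as"}


def extract_keywords(text):
    """Return lowercase non-stop-word tokens in first-seen order."""
    keywords = []
    for token in re.findall(r"[A-Za-z][A-Za-z0-9]*", text):
        word = token.lower()
        if word not in STOP_WORDS and word not in keywords:
            keywords.append(word)
    return keywords


def build_topic_index(slides):
    """Map each slide number to its list of topic keywords."""
    return {number: extract_keywords(text) for number, text in slides.items()}


# Hypothetical slide text; the disclosure does not give the slide wording.
slides = {
    2: "Computer storage: flash and SRAM",
    3: "Transistor structure: NFET and PFET in silicon",
}
topic_index = build_topic_index(slides)
```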

In some embodiments, the computerized natural language analysis (306 of FIG. 3) comprises performing a long word analysis at 312, which may include a bigram analysis at 314. In some embodiments, performing a computerized natural language analysis 306 comprises using a naive Bayes classifier at 310. Other classifiers and/or machine learning techniques may be used with embodiments of the present invention. In some embodiments, a regression analysis may be used to derive a correlation between verbal topics and corresponding slides of the presentation.

At 318, the topic analysis for each slide comprises performing an image analysis for an image within a slide from the plurality of slides. The image analysis may include using edge detection processes, gradient detection processes, and other suitable processes to detect and identify objects in an image. Based on the identification, one or more topic keywords may be assigned to the slide. For example, if a smiling face is detected in a slide, a keyword of “happy” may be entered as an entry for the slide in the index.

The topic index includes entries of topic keywords assigned to various slides. The index may comprise database tables, such as those illustrated in FIG. 13. Any suitable configuration of the index is included within the scope of the invention.

FIG. 4 is a flowchart 400 indicating process steps for verbal presentation processing in accordance with embodiments of the present invention. At block 450, uttered phrases are received from a client device, e.g., 104, 106, or 108, via a computer network 124 at the presentation synchronization system 102 (FIG. 1). The uttered phrases may be captured by a microphone, for example, microphone 212 (FIG. 2). The uttered phrases may be received in the form of a sound representation such as PCM (Pulse Code Modulated) data, a WAV file, MP3 file, compressed audio data, or other suitable audio format.

At block 452, a speech-to-text process is performed on the uttered phrase. As part of the speech-to-text process, the presentation synchronization system may perform a phoneme extraction process to extract the phonemes from the audio data. In the English language, there are 44 phonemes. Other languages may have more or fewer phonemes. For example, Spanish has 24 phonemes, while German has 45 phonemes. A variety of tools, such as CMU Sphinx, are available for performing phoneme extraction. While the examples disclosed herein use examples in English, disclosed embodiments may be adapted to operate with a variety of languages, including, but not limited to, Spanish, French, Italian, German, Portuguese, Japanese, or Chinese.

The speech-to-text process may further include performing a homonym disambiguation process. For example, some words are pronounced the same way, such as eight and ate. The homonym disambiguation process may use tokenization and part-of-speech identification to determine the most likely word that matches the phonemes. For example, if a pronoun precedes a phoneme that sounds like “eight,” then the intended word is most likely to be “ate.”
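The pronoun rule in the preceding example may be sketched, purely for illustration, as a single hand-written rule applied to a token stream. The placeholder token "eight|ate", the pronoun list, and the function name are assumptions for this sketch; a practical system would rely on a part-of-speech tagger rather than a fixed word list.

```python
# Hypothetical pronoun list; a real system would use part-of-speech tags.
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}

# Hypothetical placeholder emitted when the phonemes match both words.
AMBIGUOUS = "eight|ate"


def disambiguate(tokens):
    """Resolve each ambiguous token to 'ate' after a pronoun, else 'eight'."""
    resolved = []
    for token in tokens:
        if token == AMBIGUOUS:
            previous = resolved[-1].lower() if resolved else ""
            resolved.append("ate" if previous in PRONOUNS else "eight")
        else:
            resolved.append(token)
    return resolved


after_pronoun = disambiguate(["We", AMBIGUOUS, "lunch"])
after_noun = disambiguate(["chapter", AMBIGUOUS])
```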

The speech-to-text process may consider dialects within a language. For example, in English, there are a variety of dialects (e.g., British, American, Canadian, and Australian English) which may each have a slightly different lexicon. The speech-to-text process may use the dialect in computing the probability of a phoneme or set of phonemes matching a particular word.

At block 454, a scenario summary is generated for one or more of the received uttered phrases. An entity detection process is performed to generate the scenario summary. The entity detection may include extraction, which is the detection and preparation of named entity occurrences. The extraction phase includes POS (part of speech) tagging, tokenization, sentence boundary detection, capitalization rules and in-document statistics. The entity detection may further include noun identification, followed by identifying a subset of nouns including proper nouns, and nouns deemed to be topically pertinent. The extracted entities may be used as keywords within the scenario summary. These keywords can then be used as temporal verbal topic categories which can be matched with topic keywords pertaining to a presentation. A temporal verbal topic category is a topic category pertaining to the verbal presentation over a predetermined period of time. For example, if a presenter mentions a phrase P for X number of times within the past Y seconds, then the phrase P may be considered as a temporal verbal topic category for that point in the presentation. For example, if a presenter mentioned the phrase “data compression” eight times within the past 20 seconds, then a temporal verbal topic category of “data compression” is determined to exist. The temporal verbal topic category of “data compression” can then be used as an input to query the topic index of the slides of a presentation, and retrieve an appropriate slide (e.g., slide 608 of FIG. 6).
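The rule that a phrase P mentioned X times within the past Y seconds becomes a temporal verbal topic category may be sketched as a sliding-window counter. The class and method names below are hypothetical, and the sketch assumes that mentions are recorded in non-decreasing timestamp order.

```python
from collections import defaultdict, deque


class TemporalTopicTracker:
    """Track phrase mentions and report those frequent within a time window."""

    def __init__(self, min_mentions, window_seconds):
        self.min_mentions = min_mentions      # X in the description
        self.window_seconds = window_seconds  # Y in the description
        self._mentions = defaultdict(deque)   # phrase -> mention timestamps

    def record(self, phrase, timestamp):
        """Record one mention; timestamps are assumed non-decreasing."""
        self._mentions[phrase.lower()].append(timestamp)

    def active_categories(self, now):
        """Return phrases mentioned at least X times in the past Y seconds."""
        cutoff = now - self.window_seconds
        active = []
        for phrase, times in self._mentions.items():
            # Discard mentions that have fallen out of the window.
            while times and times[0] < cutoff:
                times.popleft()
            if len(times) >= self.min_mentions:
                active.append(phrase)
        return active


# Eight mentions of "data compression" within 20 seconds, as in the example.
tracker = TemporalTopicTracker(min_mentions=8, window_seconds=20)
for second in range(1, 9):
    tracker.record("data compression", timestamp=second)
tracker.record("transistor", timestamp=5)
```

Keeping a deque of timestamps per phrase lets expired mentions be dropped from the front in constant time per mention.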

At block 456, a topic index is queried. A query of a topic database, using the scenario summary as an input to the query, is performed to obtain one or more relevant topics based on the keywords of the scenario summary. In embodiments, the scenario summary is input to a mapping rule library, which then retrieves a topic for the scenario summary. The topic index is then searched to retrieve the slide that best represents the topic. This slide may then be rendered to the presentation audience as the presenter speaks about that topic.
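One possible way to search the topic index for the slide that best represents a topic is to score each entry by keyword overlap, as sketched below. The overlap score is an assumption of this example; the disclosure does not fix a scoring function. The index keys reuse the slide reference numerals and keywords of FIG. 13.

```python
def best_matching_slide(topic_index, verbal_categories):
    """Return the slide whose keywords overlap most with the categories.

    Returns None when no slide shares any keyword. Ties keep the first
    slide encountered.
    """
    wanted = {category.lower() for category in verbal_categories}
    best_slide, best_score = None, 0
    for slide, keywords in topic_index.items():
        score = len(wanted & {keyword.lower() for keyword in keywords})
        if score > best_score:
            best_slide, best_score = slide, score
    return best_slide


# Keywords as listed for slides 1304, 1306, and 1308 of FIG. 13.
topic_index = {
    1304: ["Computer Storage", "Flash", "SRAM", "Memory"],
    1306: ["Transistor", "NFET", "PFET", "Silicon"],
    1308: ["Data Compression", "LZMA", "GZIP", "Huffman Coding"],
}
match = best_matching_slide(topic_index, ["data compression"])
```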

FIG. 5 is a flowchart 500 indicating process steps for presentation synchronization in accordance with embodiments of the present invention. At 550, a computer-generated topic analysis is performed for each slide of the plurality of slides. At 552, a topic index, comprising an entry for each slide, for the visual presentation is created. Each entry includes one or more topic keywords associated therewith. At 553, a real-time computerized natural language analysis of a received verbal presentation is performed. At 554, one or more temporal verbal topic categories are derived from the real-time computer-generated natural language analysis. At 556, the topic index is searched for a best matching entry. The best matching entry is based on a comparison of the derived one or more temporal verbal topic categories to the topic keywords stored as entries in the topic index. The slide corresponding to the best matching entry is then queued for rendering. At 558, an option to override the slide transition is provided to the user. Accordingly, it is determined whether a user override has been received. If not, at 564, a slide (from the plurality of slides) is rendered that corresponds to the best matching entry. At 562, a machine-learning algorithm is updated, such that the algorithm, over time, can be trained to better predict whether to remain or transition to a new slide. If a user override is received at 558, then at 560, the rendering of the current slide is maintained, and at 562, the machine-learning algorithm is updated.
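The override branch of flowchart 500 (steps 558 through 564) may be sketched as follows. The function name and the feedback log are hypothetical; the log merely stands in for the machine-learning update at 562, whose details the disclosure does not specify.

```python
def resolve_transition(current_slide, best_match, override_received, feedback_log):
    """Return the slide to render and record the outcome as feedback.

    With no override, the best matching slide is rendered (step 564);
    with an override, the current slide is maintained (step 560).
    Either way the outcome is logged, standing in for step 562.
    """
    accepted = not override_received
    shown = best_match if accepted else current_slide
    feedback_log.append(
        {"from": current_slide, "proposed": best_match, "accepted": accepted}
    )
    return shown


log = []
shown_without_override = resolve_transition(5, 2, False, log)
shown_with_override = resolve_transition(5, 2, True, log)
```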

FIG. 6 is an exemplary visual slide presentation 600 used to describe operation of embodiments of the present invention. The slide presentation includes six slides: first slide 602, second slide 604, third slide 606, fourth slide 608, fifth slide 610, and sixth slide 612. They are queued by the software program to render in that order, unless a user, or system 102 (FIG. 1), intervenes. Note that rendering slides in the visual presentation may include projecting the slides onto a large screen or wall. Rendering the visual presentation may include displaying the slides on a computer screen. Any suitable method of rendering the visual presentation is included. In the example, all of the slides include words. The words are subjected to natural language analysis to detect topic keywords therefrom. The third slide includes an image 607 in addition to the text. The image 607 is analyzed by image analysis for topic keywords relevant to the detected image. The keywords are stored as entries in a topic index corresponding to the visual slide presentation.

FIG. 7 shows an example 700 of a forward slide transition based on a verbal presentation in accordance with embodiments of the present invention. In the example, user 712 is speaking into a microphone 212 (FIG. 2) with slides of the visual presentation 600 of FIG. 6 being rendered on an electronic device (such as device 106 of FIG. 1). During the user's verbal presentation, the user 712 utters the phrase 714: “Now I want to talk about the transistors we used . . . ”. Second slide 604 is rendered on the screen at the time that the user utters that phrase. System 102 (FIG. 1) receives and processes the verbal utterance, identifying the word 716: “transistors” as a keyword. Based on a comparison of the keyword to the topic index corresponding to visual slide presentation 600, it is determined that the third slide 606 is the best matching entry as the third slide has associated therewith the keyword of “transistor”. Accordingly, system 102 (FIG. 1) transitions from rendering the second slide 604 to the third slide 606. This occurs in near real time, so it is very soon after the user 712 utters the phrase 714 that the transition is made so as to keep the slide currently being rendered relevant to the real-time verbal speech of the user 712.

FIG. 8 shows an example 800 of a backward slide transition based on a verbal presentation in accordance with embodiments of the present invention. In some embodiments, the best matching entry may be in a slide previously shown, or “backwards” in terms of the slide order. Accordingly, the slides do not have to transition only in the forward direction (to a slide which has not yet been previously shown). Slides from earlier in the presentation can be transitioned to if the system determines such slide(s) are relevant based on the current speech in the verbal presentation of the user.

In the example 800, user 812 is speaking with slides of the visual presentation 600 of FIG. 6 being rendered on an electronic device (such as 106 of FIG. 1). During the user's verbal presentation, as captured by microphone 212 (FIG. 2), the user utters the phrase 814: “Once again, we can apply these techniques in a variety of computer storage technologies, such as flash and SRAM . . . ”. Fifth slide 610 is rendered on the screen at the time that the user utters that phrase. System 102 (FIG. 1) receives and processes the verbal utterance, identifying several words in the speech, including “computer storage,” “flash,” and “SRAM” as keywords in section 816 of phrase 814. Based on a comparison of the detected keywords to the entries in the topic index, it is determined that the second slide 604 is the best matching entry because slide 604 has the same keywords associated therewith. Accordingly, system 102 (FIG. 1) transitions from rendering the fifth slide 610 to the second slide 604. This occurs in near real time, so it is very soon after the user utters the phrase 814 that the transition is made so as to keep the slide being projected relevant to the current verbal speech of the user 812.

FIG. 9 shows an example 900 of rendering multiple slides simultaneously based on a verbal presentation in accordance with embodiments of the present invention. Some embodiments include detecting a second best matching entry (in addition to determining a best matching entry), and rendering the slide (from the plurality of slides) that corresponds to the second best matching entry adjacent to the slide that corresponds to the best matching entry. Accordingly, in some embodiments, more than one slide (two or more) may be displayed on the screen at a time.

In the example 900, user 912 is speaking with slides of the visual presentation 600 of FIG. 6 being rendered on an electronic device (such as 106 of FIG. 1). The user 912 utters the phrase 914: “So that explains the transistor structure. Now I want to talk about the impact on computer storage density for both flash and SRAM technologies . . . ”. Third slide 606 is displayed on the screen at the time that the user 912 utters that phrase. System 102 (FIG. 1) receives and processes the verbal utterance, identifying several words in the speech, including “computer storage,” “flash,” “density,” and “SRAM” as keywords. Based on the detected keywords in section 916 of phrase 914, it is determined that second slide 604 is the best matching entry, and the fifth slide 610 is the second best matching entry. Accordingly, system 102 (FIG. 1) transitions from rendering the third slide 606 to the rendering of both the second slide 604 and the fifth slide 610. This occurs in near real time, so it is very soon after the user 912 utters the phrase that the transition is made so as to keep the slide being projected relevant to the verbal speech of the verbal presentation. When a plurality of slides is displayed simultaneously, such slides may be depicted next to one another, or one above the other, or any suitable configuration.
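Selecting a best and a second best matching entry may be sketched by ranking slides on keyword overlap and keeping the top two. The overlap score and the function name are assumptions of this example. The keywords shown for slide 604 follow the description of FIG. 8, whereas the keywords shown for slides 606 and 610 are hypothetical, since the disclosure does not list them in full.

```python
def rank_slides(topic_index, keywords, limit=2):
    """Return up to `limit` slides ordered by descending keyword overlap.

    Slides with no shared keyword are omitted; ties are broken by the
    lower slide number.
    """
    wanted = {keyword.lower() for keyword in keywords}
    scored = []
    for slide, slide_keywords in topic_index.items():
        score = len(wanted & {keyword.lower() for keyword in slide_keywords})
        if score > 0:
            scored.append((score, slide))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [slide for _, slide in scored[:limit]]


topic_index = {
    604: ["Computer Storage", "Flash", "SRAM", "Memory"],
    606: ["Transistor"],           # hypothetical entry
    610: ["Density", "Scaling"],   # hypothetical entry
}
detected = ["computer storage", "flash", "density", "SRAM"]
to_render = rank_slides(topic_index, detected)
```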

FIG. 10 shows an example 1000 of disambiguation in accordance with embodiments of the present invention. Disambiguation is one of the computerized natural language analysis processes that may be utilized in embodiments of the present invention. As part of content ingest and analysis of the visual presentation or verbal presentation processing, text may be tokenized into words and tagged with parts of speech. For some words, there can be more than one meaning and/or part of speech.

Example 1000 shows a disambiguation example with the word “saw.” In phrase 1001, the word “saw” 1002 is a past tense verb. In embodiments, a machine learning natural language analysis module may identify the prior token 1004 to the word “saw” as a pronoun, and the following token 1003 as an article. In training a classifier, the pattern of pronoun-token-article may be associated with a verb, and thus the token is interpreted as a verb.

In phrase 1005, the word “saw” 1006 is a noun for a cutting tool. In embodiments, a machine learning natural language analysis module may identify the prior token 1008 to the word saw as an article, and the following token 1009 as a verb. In training a classifier, the pattern article-token-verb may be associated with a noun, and thus the token is interpreted as a noun.

In phrase 1011, the word “saw” 1010 is an infinitive verb. In embodiments, a machine learning natural language analysis module may identify the prior token 1012 to the word “saw” as part of an infinitive form, and the following token 1015 as an article. In training a classifier, the pattern “to”-token-article may be associated with a verb, and thus the token is interpreted as a verb. These classifiers and techniques for disambiguation are examples, and other classifiers and techniques are possible.
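The three context patterns described for the word “saw” may be illustrated with a fixed lookup table, as sketched below. This table is only a stand-in for the trained classifier described above; the tag names and the function name are assumptions made for this example.

```python
# Context patterns from the three phrases: (previous tag, following tag).
# A fixed table stands in for a trained classifier in this sketch.
CONTEXT_PATTERNS = {
    ("pronoun", "article"): "verb",  # pattern of phrase 1001
    ("article", "verb"): "noun",     # pattern of phrase 1005
    ("to", "article"): "verb",       # pattern of phrase 1011
}


def tag_ambiguous_token(previous_tag, following_tag):
    """Return the part of speech implied by the surrounding tags."""
    return CONTEXT_PATTERNS.get((previous_tag, following_tag), "unknown")


past_tense_case = tag_ambiguous_token("pronoun", "article")
tool_case = tag_ambiguous_token("article", "verb")
infinitive_case = tag_ambiguous_token("to", "article")
```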

FIG. 11 shows an example 1100 of a dispersion analysis in accordance with embodiments of the present invention. In some embodiments, the performing a computerized natural language analysis (306 of FIG. 3) comprises performing a dispersion analysis. In a multiple slide presentation having 97 slides, a particular word may have a non-uniform distribution across the slides. In the example, a dispersion analysis is performed for the word “compression” 1102 within the slides of the presentation. A graph comprises a horizontal axis 1106 representing a slide number within the presentation, and a vertical axis 1104 representing a number of occurrences of word 1102 in the presentation. As can be seen in the graph, the presence of the word 1102 is concentrated in certain slides. A maximum concentration 1108 is identified in the area around slide 65. In embodiments, slides in proximity to the maximum concentration of the dispersion analysis have the word 1102 loaded into the respective topic indexes. The dispersion analysis can provide an indication of relevance. In the example 1100, the keyword (category) 1102 is “compression.” Thus, in this example, when the presenter uses the word “compression,” the keywords in the topic index relating to slides at or near slide 65 are deemed relevant, and the slides at or near slide 65 may be selected to match the verbal phrases currently uttered by a presenter.
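A dispersion analysis of this kind may be sketched by counting occurrences of a word per slide, locating the slide with the maximum count, and selecting the slides within a chosen distance of it. The function names, the `radius` parameter, and the four-slide deck are hypothetical; the sketch assumes a non-empty deck and whitespace-separated words.

```python
def dispersion(slide_texts, word):
    """Return the number of occurrences of `word` on each slide, in order."""
    target = word.lower()
    return [text.lower().split().count(target) for text in slide_texts]


def slides_near_peak(counts, radius=1):
    """Return 1-based slide numbers within `radius` of the maximum count.

    Assumes `counts` is non-empty; the first maximum is used on ties.
    """
    peak = max(range(len(counts)), key=counts.__getitem__)
    first = max(0, peak - radius)
    last = min(len(counts) - 1, peak + radius)
    return [index + 1 for index in range(first, last + 1)]


# Hypothetical four-slide deck for illustration.
deck = [
    "intro",
    "compression basics",
    "compression compression ratio",
    "results",
]
counts = dispersion(deck, "compression")
nearby = slides_near_peak(counts, radius=1)
```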

FIG. 12 shows an example 1200 of a bigram analysis in accordance with embodiments of the present invention. In some embodiments, the topic analysis for each slide comprises a long word analysis (312 of FIG. 3). The long word analysis may include performing a bigram analysis (314 of FIG. 3). In a bigram analysis, a pair of words in a particular order may be searched for within a body of text of an input query and/or a verbal utterance. Based on the analyses, one or more topic keywords may be assigned to the slide. In this example, the bigram “computer storage” is searched within a text excerpt. Three occurrences, indicated as 1202A, 1202B, and 1202C, are present in the text passage. In embodiments, the usage of bigrams, trigrams, or more generally, n-grams (number=n), may be used to improve relevance in searching the text of a presentation and/or processing the verbal language uttered by a presenter.
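A bigram search of this kind might be sketched as follows. The helper names `ngrams` and `count_ngram`, the simple regular-expression tokenizer, and the sample text are hypothetical; the text excerpt of FIG. 12 is not reproduced here.

```python
import re


def ngrams(text, n=2):
    """Split text into lowercase words and return all ordered n-word tuples."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def count_ngram(text, phrase):
    """Count how often the words of `phrase` appear adjacent and in order."""
    target = tuple(phrase.lower().split())
    return sum(1 for gram in ngrams(text, len(target)) if gram == target)


sample_text = (
    "Computer storage is cheap. Modern computer storage uses flash. "
    "Storage computer is not computer storage."
)
occurrences = count_ngram(sample_text, "computer storage")
```

Because word order matters, the reversed pair "storage computer" in the sample text is not counted. The same helper handles trigrams or longer n-grams by passing a longer phrase.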

FIG. 13 shows examples of entries in a topic index for a visual presentation. In the example, the topic index comprises database tables including topic keywords from slides on the right that are inferred based on context from natural language processing or determined from image analysis of images on the slides. Database table 1314 stores the keywords detected from slide 1304. The keywords include “Computer Storage,” “Flash,” “SRAM,” and “Memory.” Database table 1316 stores the keywords detected from slide 1306. The keywords include “Transistor,” “NFET,” “PFET,” and “Silicon.” Database table 1318 stores the keywords detected from slide 1308. The keywords include “Data Compression,” “LZMA,” “GZIP,” and “Huffman Coding.” The database tables may be stored in system 102 (FIG. 1), or completely or partially on a client device 104, 106, or 108. The database tables may be relational (for example, SQL-based), hierarchical, or any other suitable configuration.
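For illustration, the topic index entries above can be modeled as an in-memory mapping and searched for a best matching entry. This is a minimal sketch: the dictionary layout, the function name `best_matching_slide`, and the scoring rule (the count of overlapping keywords) are assumptions, since no particular scoring method or storage schema is prescribed here.

```python
# Keywords are taken from the example tables; the slide reference numerals
# serve as keys. Lowercasing is an assumption to simplify matching.
TOPIC_INDEX = {
    1304: {"computer storage", "flash", "sram", "memory"},
    1306: {"transistor", "nfet", "pfet", "silicon"},
    1308: {"data compression", "lzma", "gzip", "huffman coding"},
}


def best_matching_slide(index, verbal_topics):
    """Return the slide whose keywords overlap most with the verbal topics."""
    topics = {topic.lower() for topic in verbal_topics}
    best_slide, best_score = None, 0
    for slide, keywords in index.items():
        score = len(keywords & topics)  # hypothetical score: shared keywords
        if score > best_score:
            best_slide, best_score = slide, score
    return best_slide  # None when no entry shares any keyword


match = best_matching_slide(TOPIC_INDEX, ["GZIP", "Huffman Coding"])
```

With this scoring rule, ties are resolved in favor of the entry encountered first; a weighted or learned score could be substituted without changing the overall flow.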

FIG. 14 shows an exemplary user interface 1400 for a web-conferencing embodiment of the present invention. In this example, the presentation is a web conference, and slides are rendered by displaying on a computer screen (user interface) of “attendee” client devices (such as 104, 106, and 108 of FIG. 1). In the example, the user interface 1400 is the controlling user interface. This is the screen of the electronic device of the presenting user or other person managing the slides in real time, and presents the currently displayed slide, the slide corresponding to the best matching entry, and an override option. Fifth slide 1402 is indicated as currently displayed. Second slide 1404 is determined as the best matching entry as the presenting user speaks. Accordingly, system 102 (FIG. 1) indicates that it is recommended that a transition to the second slide 1404 be performed. The user override option is presented as a question: “Change to the new slide” with a set of buttons 1406 and 1408 by which the user can indicate “yes” or “no,” respectively. If the user answers “no,” the transition recommendation is overridden. If the user answers “yes,” the transition is implemented. It should be recognized that any suitable mechanism by which to accept a user override is included within the scope of the invention.
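The override behavior described above might be sketched as follows. The function name `resolve_transition` and the `confirm` callback, which stands in for the “yes” and “no” buttons, are hypothetical and are used only for illustration.

```python
def resolve_transition(current, recommended, confirm):
    """Return the slide to display after consulting the presenting user.

    `confirm` is a callable taking (current, recommended) and returning
    True for "yes" or False for "no" (an assumed stand-in for the buttons).
    """
    # No recommendation, or already on the recommended slide: nothing to ask.
    if recommended is None or recommended == current:
        return current
    # "yes" implements the transition; "no" overrides the recommendation.
    return recommended if confirm(current, recommended) else current


accepted = resolve_transition(5, 2, lambda cur, rec: True)
overridden = resolve_transition(5, 2, lambda cur, rec: False)
```

Passing the confirmation step as a callable keeps the decision logic independent of the mechanism used to accept the override, whether buttons, a keystroke, or another input.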

Embodiments of the present invention improve the technical field of electronic communication. Using techniques of disclosed embodiments, improved synchronization of visual presentation material and live speaker discussion is achieved. By improving synchronization, errors in slide presentation are reduced, which enables the overall length of a presentation to be reduced. This saves energy on projectors, mobile devices, servers, and other associated equipment used for presentations. Furthermore, for web-conferencing embodiments, a savings in network resources is also achieved.

As can now be appreciated, disclosed embodiments provide techniques for automatically synchronizing a visual presentation with a live presenter. Visual presentation slides are preprocessed to determine one or more topics for each slide. A topic index contains one or more topics corresponding to slides of the presentation. As a presenter provides a verbal presentation (i.e., a discussion) for corresponding slides, natural language processing analyzes the verbal presentation and creates one or more temporal verbal topic categories. The temporal verbal topic categories are used to search the topic index to find one or more slides that best match the current temporal verbal topic categories. In this way, the slides can automatically follow the discussion of the presenter, enabling improved presentations that can enhance the user experience, increase audience engagement, and improve the dissemination of information. Machine learning techniques with user feedback mechanisms can be used to improve accuracy of the system over time. Furthermore, in addition to slide presentations, other types of visual presentations may be used in some embodiments, including, but not limited to, video clips, software demonstrations, video gaming applications, and/or tutorials.

Some of the functional components described in this specification have been labeled as systems or units in order to more particularly emphasize their implementation independence. For example, a system or unit may be implemented as a hardware circuit comprising custom VLSI circuits or gate arrays, off-the-shelf semiconductors such as logic chips, transistors, or other discrete components. A system or unit may also be implemented in programmable hardware devices such as field programmable gate arrays, programmable array logic, programmable logic devices, or the like. A system or unit may also be implemented in software for execution by various types of processors. A system or unit or component of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions, which may, for instance, be organized as an object, procedure, or function. Nevertheless, the executables of an identified system or unit need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, comprise the system or unit and achieve the stated purpose for the system or unit.

Further, a system or unit of executable code could be a single instruction, or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Similarly, operational data may be identified and illustrated herein within modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices and disparate memory devices.

Furthermore, systems/units may also be implemented as a combination of software and one or more hardware devices. For instance, location determination and alert message and/or coupon rendering may be embodied in the combination of a software executable code stored on a memory medium (e.g., memory storage device). In a further example, a system or unit may be the combination of a processor that operates on a set of operational data.

As noted above, some of the embodiments may be embodied in hardware. The hardware may be referenced as a hardware element. In general, a hardware element may refer to any hardware structures arranged to perform certain operations. In one embodiment, for example, the hardware elements may include any analog or digital electrical or electronic elements fabricated on a substrate. The fabrication may be performed using silicon-based integrated circuit (IC) techniques, such as complementary metal oxide semiconductor (CMOS), bipolar, and bipolar CMOS (BiCMOS) techniques, for example. Examples of hardware elements may include processors, microprocessors, circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field programmable gate array (FPGA), logic gates, registers, semiconductor devices, chips, microchips, chip sets, and so forth. However, the embodiments are not limited in this context.

Also noted above, some embodiments may be embodied in software. The software may be referenced as a software element. In general, a software element may refer to any software structures arranged to perform certain operations. In one embodiment, for example, the software elements may include program instructions and/or data adapted for execution by a hardware element, such as a processor. Program instructions may include an organized list of commands comprising words, values, or symbols arranged in a predetermined syntax that, when executed, may cause a processor to perform a corresponding set of operations.

The present invention may be a system, a method, and/or a computer program product at any possible technical detail level of integration. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.

The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, may be non-transitory, and thus is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device. Program data may also be received via the network adapter or network interface.

Computer readable program instructions for carrying out operations of embodiments of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of embodiments of the present invention.

These computer readable program instructions may be provided to a processor of a computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

While the disclosure outlines exemplary embodiments, it will be appreciated that variations and modifications will occur to those skilled in the art. For example, although the illustrative embodiments are described herein as a series of acts or events, it will be appreciated that the present invention is not limited by the illustrated ordering of such acts or events unless specifically stated. Some acts may occur in different orders and/or concurrently with other acts or events apart from those illustrated and/or described herein, in accordance with the invention. In addition, not all illustrated steps may be required to implement a methodology in accordance with embodiments of the present invention. Furthermore, the methods according to embodiments of the present invention may be implemented in association with the formation and/or processing of structures illustrated and described herein as well as in association with other structures not illustrated. Moreover, in particular regard to the various functions performed by the above described components (assemblies, devices, circuits, etc.), the terms used to describe such components are intended to correspond, unless otherwise indicated, to any component which performs the specified function of the described component (i.e., that is functionally equivalent), even though not structurally equivalent to the disclosed structure which performs the function in the herein illustrated exemplary embodiments of the invention. In addition, while a particular feature of embodiments of the invention may have been disclosed with respect to only one of several embodiments, such feature may be combined with one or more features of the other embodiments as may be desired and advantageous for any given or particular application. Therefore, it is to be understood that the appended claims are intended to cover all such modifications and changes that fall within the true spirit of embodiments of the invention.

1. A computer-implemented method for automatic synchronization of a visual presentation comprising a plurality of slides with a verbal presentation, comprising: performing a computer-generated topic analysis for each slide of the plurality of slides; creating a topic index for the visual presentation, wherein the topic index comprises an entry for each slide, wherein each entry includes one or more topic keywords associated therewith; performing a real-time computerized natural language analysis of the verbal presentation; deriving one or more temporal verbal topic categories from the real-time computerized natural language analysis; searching the topic index for a best matching entry, based on the one or more temporal verbal topic categories; and rendering a slide from the plurality of slides that corresponds to the best matching entry.
 2. The method of claim 1, wherein performing a computer-generated topic analysis for each slide comprises performing a computerized natural language processing analysis of each slide of the plurality of slides.
 3. The method of claim 1, wherein performing a computer-generated topic analysis includes performing an image analysis for an image within a slide from the plurality of slides.
 4. The method of claim 1, wherein performing a computerized natural language analysis comprises: performing an entity detection process on text data from one or more slides from the plurality of slides; deriving a topic based on each entity detected from the entity detection process; and recording the derived topic and corresponding slide in the topic index.
 5. The method of claim 4, further comprising: detecting a second best matching entry; and rendering the slide from the plurality of slides that corresponds to the second best matching entry adjacent to the slide that corresponds to the best matching entry.
 6. The method of claim 2, wherein performing a computerized natural language analysis comprises performing a long word analysis.
 7. The method of claim 2, wherein performing a computerized natural language analysis comprises performing a dispersion analysis.
 8. The method of claim 2, wherein performing a computerized natural language analysis comprises performing a bigram analysis.
 9. The method of claim 2, wherein performing a computerized natural language analysis process comprises using a naive Bayes classifier.
 10-20. (canceled)